Abstract

Decision analytics commonly focuses on the text mining of financial news
sources in order to provide managerial decision support and to predict stock
market movements. Existing predictive frameworks almost exclusively apply
traditional machine learning methods, whereas recent research indicates
that traditional machine learning methods are not sufficiently capable of
extracting suitable features and capturing the non-linear nature of complex
tasks. As a remedy, novel deep learning models aim to overcome this issue
by extending traditional neural network models with additional hidden
layers. Indeed, deep learning has been shown to outperform traditional methods
in terms of predictive performance. In this paper, we adapt the novel deep
learning technique to financial decision support. In this instance, we aim to
predict the direction of stock movements following financial disclosures. As
a result, we show how deep learning can outperform the accuracy of random
forests as a benchmark for machine learning by 5.66%.

1. Introduction

Organizations are constantly looking for ways to improve their
decision-making processes in core areas, such as marketing, firm communication,
production and procurement [1]. While the classical approach relies on having
humans devise simple decision-making rules, modern decision support is
predominantly based on statistical evidence that originates from analyzing data
[2, 3, 4, 5, 6, 7]. This data-driven decision support was largely triggered by
the Big Data era [8, 9, 10, 11]. The term Big Data represents the concepts
of ever-increasing volumes of available data, combined with the rapid
development of powerful computer hardware, in order to derive decisions from
complex data. Such data is typically characterized by the Four Vs, namely,
volume, variety, veracity and velocity, referring to the fact that the data
collections are massive in size and, in addition, involve different formats of
data (e.g. video, text), quickly changing data and data that is uncertain
[12].

Crucial aspects of data-driven decision support systems entail the
prediction of future events, such as consumer behavior or stock market reactions
to press releases, based on an analysis of historical data EE]. Decision
analytics thus frequently utilizes modeling, machine learning and data mining
techniques from the area of predictive analytics. In fact, predictive analytics
can be instrumented for “generating new theory, developing new measures,
comparing competing theories, improving existing theories, assessing the
relevance of theories, and assessing the predictability of empirical phenomena”
[13].

Predictive analytics frequently contributes to managerial decision
support, as is the case when predicting investor reaction to press releases and
financial disclosures [14]. In this instance, predictive analytics is typically
confronted with massive datasets of heterogeneous and mostly textual
content, while the outcomes are simultaneously of high impact for any business.
Until now, decision support for financial news still predominantly relies on
traditional machine learning techniques, such as support vector machines or
decision trees [14, 15, 16].

The performance of traditional machine learning algorithms largely
depends on the features extracted from underlying data sources, which has
consequently elicited the development and evaluation of feature engineering
techniques HI- Research efforts to automate and optimize the feature
engineering process, along with a growing awareness of current neurological
research, have led to the emergence of a new sub-field of machine learning
research called deep learning [18]. Deep learning takes into account recent
knowledge on the way the human brain processes data and thus enhances
traditional neural networks by a series of hidden layers. This series of
hidden layers allows for deeper knowledge representation, possibly resulting in
improved predictive performance. Deep learning methods have been applied
to well-known challenges in the machine learning discipline, such as pattern
recognition and natural language processing. The corresponding results
indicate that deep learning can outperform classical machine learning methods
(which embody only a shallow knowledge layer) in terms of accuracy [c. f.
CHI [19].

In this paper, we want to unleash the predictive power of deep learning for
the prediction of stock market movements following a news disclosure. We
expect that deep learning can learn appropriate features from the underlying
textual corpus efficiently and thus surpass other state-of-the-art classifiers.
However, the successful application of deep learning techniques is not an
easy task; deep learning implicitly performs feature extraction through the
interplay of different hidden layers, the representation of the textual input
and the interactions between layers. In order to master this challenge, we
apply the recursive autoencoder model introduced by Socher et al. [19] and
tailor it to the prediction of stock price directions based on the content of
financial materials.

The remainder of this paper is structured as follows. First, we provide
a short overview of related work in which we discuss similar text mining
approaches and give an overview of relevant deep learning publications. We
then explain our methodology and highlight the differences between classical
and deep learning approaches. Finally, we evaluate both approaches using
financial news disclosures and discuss the managerial implications.

2. Related Work 

This Information Systems research is positioned at the intersection
between finance, Big Data, decision support and predictive analytics. The first
part of this section discusses traditional approaches of providing decision
support based on financial news. In the second part, we discuss previous work
that focuses on the novel deep learning approach.

2.1. Decision Analytics for Financial News 

Text mining of financial disclosures represents one of the fundamental
approaches for decision analytics in the finance domain. The available work
can be categorized by the necessary preprocessing steps, the text mining
algorithms, the underlying text source (e.g. press releases, financial news,
tweets) and its focus on facts or opinions (e.g. quarterly reports,
analyst recommendations). While Pang and Lee [16] provide a comprehensive
domain-independent survey, other overviews concentrate solely on the
financial domain [15, 20]. In a very recent survey, Nassirtoussi et al. [14] focus
specifically on studies aimed at stock market prediction. We structure the
discussion of the related research according to the above categories.

Among the most popular text mining algorithms are classical machine
learning algorithms, such as support vector machines, regression algorithms,
decision trees and Naive Bayes. Neural network models have been used more
rarely, but are slowly gaining traction, just as in other application
domains [14]. Furthermore, Bayesian learning can generate
domain-dependent dictionaries [21].

As part of preprocessing, the first step in most text mining approaches is
the generation of a set of values that represent relevant textual features, which
can be used as inputs for the subsequent mining algorithms. This usually
involves the selection of features based on the raw text sources, some kind of
dimensionality reduction and the generation of a good feature representation,
such as binary vectors. A comprehensive discussion of the various techniques
used for feature engineering can be found in Nassirtoussi et al. [14] and Pang
and Lee [16].

The text sources used for text mining include financial news [e.g. 22, 23]
and company-specific disclosures, and range from the less formal, such as
tweets [e.g. 24], to more formal texts, such as corporate filings [e.g. 25].
Some researchers have focused exclusively on the headlines of news sources
to exclude the noise usually contained in longer texts [26].

News disclosures with a fact-based focus are especially relevant for
investors. As such, German ad hoc news in English contain strictly regulated
content and a tight logical connection to the stock price, making them an
intriguing application in research. The measurable effect of ad hoc news on
abnormal returns on the day of an announcement has been established by
several authors [c. f. |2Z112HI ESI El]. Consequently, we utilize the same news
corpus in our following evaluation.

2.2. Deep Learning as an Emerging Trend

Deep learning originally focused on complex tasks, in which datasets
are usually high-dimensional dzi m\. As such, one of the first successful
deep learning architectures consisted of an autoencoder in combination with
a Boltzmann machine, where the autoencoder carries out unsupervised
pre-training of the network weights [18]. This model performs well on several
benchmarks from the machine learning literature, such as image recognition.
Moreover, its architecture can be adapted to enhance momentum stock
trading [30] as one of the few successful applications of deep learning to financial
decision support.

The natural language processing community has only recently started to
adapt deep learning principles to the specific requirements of language
recognition tasks. For example, Socher et al. [19] utilize a recursive autoencoder to
predict sentiment labels based on individual movie review sentences. Further
research improved the results on the same dataset by combining a recursive
neural tensor network with a sentiment treebank [31].

3. Methodologies for Financial Decision Support

This section introduces our research framework to provide financial
decision support based on news disclosures. In brief, we introduce a benchmark
classifier and our deep learning architecture to predict stock movements.
Figure 1 illustrates how we compare both prediction algorithms.
The random forest and the recursive autoencoder are both trained to
predict stock market directions based on the ad hoc announcements and the
corresponding abnormal returns. To compare the performance of the recursive
autoencoder to the benchmark, we apply the same test set to each of the
trained algorithms and measure the predictive performance in terms of
accuracy, precision and recall based on the confusion matrix.
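
The performance measures named above follow directly from the four cells of a binary confusion matrix. The following minimal sketch illustrates the standard definitions; the function name and the example counts are hypothetical and are not taken from our experiments.

```python
def classification_metrics(tp, fp, fn, tn):
    """Derive accuracy, precision, recall and F1 from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts, treating "up" as the positive class.
metrics = classification_metrics(tp=40, fp=10, fn=20, tn=30)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Applying the same function to the predictions of both classifiers on the identical test set keeps the comparison consistent.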


^1 For details, see reading list “Deep Learning”. Retrieved April 21, 2015 from
http://deeplearning.net/reading-list/

Figure 1: Research Framework Comparing Classical Machine Learning and
Deep Learning

Following Nassirtoussi et al. [14], we divide the overall procedure into
steps for data generation, feature selection, feature reduction and feature
representation. Both approaches, the benchmark algorithm and the
recursive autoencoder, differ fundamentally in their preprocessing. The
application of a random forest requires traditional feature engineering, whereas the
recursive autoencoder, as a remedy, automatically generates a feature
representation as part of its optimization algorithm. This is indicated in Figure 1
by the extension of the recursive autoencoder box over all preprocessing steps.

3.1. Benchmark: Predicting Stock Movements with Random Forests 

In a first step, one transforms the running text into a matrix
representation, which subsequently works as the input to the actual random forest
algorithm. First of all, we remove numbers, punctuation and stop words
from the running text and then split it into tokens [32]. Afterwards, we
count how often terms occur in each news disclosure, remove sparse entries
to reduce the dimensionality and store these values in
a document-term matrix. The document-term matrix then represents the
features. The actual values are weighted [33] by the term frequency-inverse
document frequency (tf-idf). This is a common approach in information
retrieval to adjust the word frequencies by their importance.
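
The preprocessing steps above can be sketched as follows. This is a minimal illustration and not the implementation used in our experiments: the stop word list, the `min_doc_freq` cut-off for sparse terms, the example headlines and the particular tf-idf variant (raw term count times log(N / document frequency)) are all assumptions made for the example.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop word list (an assumption; real lists are much longer).
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "for"}


def tokenize(text):
    """Lower-case, replace numbers and punctuation by spaces, split, drop stop words."""
    cleaned = re.sub(r"[^a-z\s]", " ", text.lower())
    return [token for token in cleaned.split() if token not in STOP_WORDS]


def tfidf_matrix(documents, min_doc_freq=1):
    """Return (vocabulary, rows) of a tf-idf weighted document-term matrix."""
    tokenized = [tokenize(doc) for doc in documents]
    # Document frequency: in how many documents does each term occur?
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    # Dropping rare terms stands in for the removal of sparse entries.
    vocab = sorted(term for term, df in doc_freq.items() if df >= min_doc_freq)
    n_docs = len(documents)
    rows = []
    for tokens in tokenized:
        counts = Counter(tokens)
        rows.append([counts[t] * math.log(n_docs / doc_freq[t]) for t in vocab])
    return vocab, rows


# Hypothetical headlines for illustration only.
headlines = [
    "Company Ltd. raises profit forecast for 2011",
    "Company Ltd. announces placing of 5,000 shares",
    "Profit warning: forecast lowered",
]
vocab, matrix = tfidf_matrix(headlines)
print(vocab)
print([round(value, 3) for value in matrix[0]])
```

Terms that occur in many headlines, such as "company", receive a low weight, whereas terms unique to one headline receive the highest weight.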

In the following evaluation, we utilize random forests as a benchmark
classifier. Random forests represent one of the most popular machine learning
algorithms due to their favorable predictive accuracy, relatively low
computational requirements and robustness [34, 35, 36]. Random forests are an
ensemble learning method for classification and regression, which is based on
the construction and combination of many de-correlated decision trees.

3.2. Deep Learning Architecture: Recursive Autoencoders

This section describes the underlying architecture of our deep learning
approach for financial disclosures based on so-called autoencoders. The
architecture of an autoencoder is illustrated in Figure 2. An autoencoder is
basically an artificial neural network, which finds a lower-dimensional
representation of input values. Let $x \in [0,1]^N$ denote our input vector, for which
we seek a lower-dimensional representation $y \in [0,1]^M$ with $M < N$. The
mapping $f$ between $x$ and $y$ is named encoding function and can be,
generally speaking, any non-linear function, although a common choice is the
sigmoid function

$$ f(x) = \sigma(Wx + b) = \frac{1}{1 + \exp(-(Wx + b))} \tag{1} $$

with parameters $W$ and $b$.

The key idea of an autoencoder is to find a second mapping from $y$ to
$z \in [0,1]^N$ given by $f'(y) = \sigma(W'y + b')$, such that $z = f'(y)$ is
almost equal to the input $x$. Mathematically speaking, we choose the free
parameters in $f$ and $f'$ by minimizing the difference between the original
input vector $x$ and the reconstructed vector $z$. For instance, the weights
$W$ and $W'$ can be calculated via gradient descent. Altogether, the
representation $y$ (often called code) is a lower-dimensional representation
of the input data; it is frequently used as input features for subsequent
learning tasks because it only contains the most relevant or most
discriminating features of the input space.
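
To make the interplay of encoding, reconstruction and gradient descent concrete, the following minimal sketch trains a tiny autoencoder in plain Python. It is an illustration under stated assumptions rather than the model used in our experiments: the squared reconstruction error, the learning rate, the number of epochs, the random initialization and the toy data are all chosen for the example, and the names `W2` and `b2` stand for $W'$ and $b'$.

```python
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def layer(W, b, v):
    """Compute sigma(W v + b) for a weight matrix W (list of rows) and bias vector b."""
    return [sigmoid(sum(w * vi for w, vi in zip(row, v)) + bi) for row, bi in zip(W, b)]


def train_autoencoder(data, n_code, epochs=500, lr=0.5, seed=0):
    """Fit encoder (W, b) and decoder (W2, b2) by stochastic gradient descent
    on the squared reconstruction error 0.5 * ||x - z||^2."""
    rng = random.Random(seed)
    n_in = len(data[0])
    W = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_code)]
    b = [0.0] * n_code
    W2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_code)] for _ in range(n_in)]
    b2 = [0.0] * n_in
    losses = []
    for _ in range(epochs):
        epoch_loss = 0.0
        for x in data:
            y = layer(W, b, x)      # code y = f(x)
            z = layer(W2, b2, y)    # reconstruction z = f'(y)
            epoch_loss += 0.5 * sum((zi - xi) ** 2 for zi, xi in zip(z, x))
            # Backpropagate the reconstruction error through both sigmoid layers.
            dz = [(zi - xi) * zi * (1.0 - zi) for zi, xi in zip(z, x)]
            dy = [sum(W2[i][j] * dz[i] for i in range(n_in)) * y[j] * (1.0 - y[j])
                  for j in range(n_code)]
            for i in range(n_in):
                for j in range(n_code):
                    W2[i][j] -= lr * dz[i] * y[j]
                b2[i] -= lr * dz[i]
            for j in range(n_code):
                for k in range(n_in):
                    W[j][k] -= lr * dy[j] * x[k]
                b[j] -= lr * dy[j]
        losses.append(epoch_loss)
    return (W, b, W2, b2), losses


# Toy data: four one-hot vectors (N = 4) compressed to a code of length M = 2.
data = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
(W, b, W2, b2), losses = train_autoencoder(data, n_code=2)
code = layer(W, b, data[0])
print("code of first input:", [round(c, 3) for c in code])
print("loss in first and last epoch:", round(losses[0], 3), round(losses[-1], 3))
```

After training, only the encoder is needed to obtain the code for a new input; the decoder serves solely to define the reconstruction error.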



Figure 2: An Autoencoder Searches a Mapping Between Input x and a
Lower-Dimensional Representation y Such That the Reconstructed Value
z Is Similar to the Input x.

The classical autoencoder works merely with a simple vector as input.
In order to incorporate contextual information, we extend the classical
autoencoder, resulting in a so-called recursive autoencoder. Here, one trains a
sequence of autoencoders, where each not only takes a vector x as input but
also recursively the lower-dimensional code of the previous autoencoder in the
sequence. Let us demonstrate this approach with an example as illustrated
in Figure 3. We process input in the form of a sequence of words. Each word
is, for example, given by a binary vector with zeros except for a single entry
with 1 representing the current word. Then, we train the first autoencoder with the
input from the first two words Company and Ltd. Its lower-dimensional code
is then input to the second autoencoder together with the vector
representation of the word placing. This recursion proceeds up to the final autoencoder,
which produces as output the code representation for the complete sentence.
Hence, this recursive approach aims to generate a compact code
representation of a complete sentence while incorporating contextual information in the
code layer. More precisely, this approach can learn from an ordered sequence
of words and not only the pure frequencies. In addition, the recursive
autoencoder entails an intriguing advantage: it can compress large input vectors in
an unsupervised fashion without the need for class labels.

Figure 3: A Recursive Autoencoder Is a Sequence of Autoencoders, Where
Each Not Only Takes a Vector as Input But Also the Lower-Dimensional
Code of the Previous Autoencoder in the Sequence


In a first step, the individual words of an input sentence (1) are mapped
onto vectors of equal length I (2). The initial values of these vectors are
sampled from a Gaussian distribution and later continuously updated through
backpropagation. Through a recursive application of the autoencoder
algorithm (3), the complete input sentence is then compressed bit by bit into a
single code representation of length I (4). The first autoencoder generates
a code representation of length I from the vectors representing the first two
words in the sentence. The second autoencoder takes this code representation
and the third word vector as inputs, and calculates the code representation
of the next level.
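
The recursive compression described above can be sketched as a left-to-right fold over the word vectors. The sketch below only shows the forward pass with untrained, randomly initialized parameters; the single weight matrix shared across all steps, the code length of 4, the Gaussian scale and all function names are assumptions made for illustration.

```python
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def encode_pair(W, b, left, right):
    """Compress the concatenation [left; right] (length 2 * I) into one code of length I."""
    joined = left + right
    return [sigmoid(sum(w * v for w, v in zip(row, joined)) + bi) for row, bi in zip(W, b)]


def sentence_code(word_vectors, W, b):
    """Fold a sentence from left to right: the code of the previous step and the
    next word vector are the inputs of the next encoding step."""
    code = word_vectors[0]
    for vector in word_vectors[1:]:
        code = encode_pair(W, b, code, vector)
    return code


rng = random.Random(42)
code_length = 4  # hypothetical value for the vector length I
words = ["Company", "Ltd", "placing"]
# Word vectors are initialized from a Gaussian distribution.
embeddings = {w: [rng.gauss(0.0, 0.1) for _ in range(code_length)] for w in words}
# Untrained encoder parameters, assumed to be shared by all steps.
W = [[rng.gauss(0.0, 0.5) for _ in range(2 * code_length)] for _ in range(code_length)]
b = [0.0] * code_length

code = sentence_code([embeddings[w] for w in words], W, b)
reversed_code = sentence_code([embeddings[w] for w in reversed(words)], W, b)
print([round(c, 3) for c in code])
print([round(c, 3) for c in reversed_code])
```

Because each step depends on the code of the previous one, the same words in a different order yield a different sentence code, which a pure frequency count cannot express.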

In order to extract and predict sentiment values, we use an extended
variant of the recursive autoencoder model [19], which includes an additional
softmax layer in each autoencoder. This softmax function estimates the
probability that an input vector $x$ belongs to a certain class
$j \in \{1, \ldots, K\}$ via

$$ P(j \mid x) = \frac{\exp(x^T W_j)}{\sum_{k=1}^{K} \exp(x^T W_k)} \tag{2} $$

In order to train this supervised model, we optimize the weights $W_k$ of both
the autoencoders and the softmax layers simultaneously with a combined
target function. We then utilize the trained weights to classify unknown
sentences by first computing the code representation inside the recursive
autoencoder and, second, calculating the probabilities for each class from the
softmax function. Interestingly, the backward mapping $f'$ is needed for
training but is no longer needed for the prediction (that is why black circles
indicate the only vectors necessary for prediction in Figures 2 and 3). A detailed
description of this approach can be found in [19]; the authors also provide a
Java implementation which is used for all of the following experiments.
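
As a small illustration of the softmax step, the following sketch computes the class probabilities for a given code vector. The code vector and the weights are hypothetical values; subtracting the maximum score before exponentiating is a common numerical safeguard that leaves the resulting probabilities unchanged.

```python
import math


def softmax_probabilities(x, W):
    """Return exp(x^T W_j) / sum_k exp(x^T W_k) for every class weight vector W_j."""
    scores = [sum(xi * wi for xi, wi in zip(x, w_j)) for w_j in W]
    shift = max(scores)  # avoids overflow without changing the ratios
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


code = [0.2, 0.7, 0.1]                    # hypothetical sentence code
W = [[1.0, -1.0, 0.5], [-0.5, 2.0, 0.0]]  # hypothetical weights for "down" and "up"
probs = softmax_probabilities(code, W)
print(dict(zip(["down", "up"], (round(p, 3) for p in probs))))
```

The predicted class is simply the one with the highest probability.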

4. Evaluation: Predicting Stock Market Direction from Financial News

In this section, we discuss and evaluate our experimental setting for
predicting the direction of stock market movements following financial
disclosures. We start with describing the steps involved in the generation of the
underlying dataset and then compare classical machine learning with our
deep learning architecture.



4.1. Dataset

Our news corpus originates from regulated ad hoc announcements^2
between January 2004 and the end of June 2011 in English. These
announcements conform to German regulations that require each listed company in
Germany to immediately publish any information with a potentially
significant effect on the stock price. With their direct relation to a particular stock,
the tightly controlled content and the measurable effect on the stock price
on the day of the announcement, ad hoc announcements are particularly
well-suited for the development and evaluation of techniques for predictive
analytics.

Since recursive autoencoders work on sentence tokens, we exclusively use
the headlines of English ad hoc announcements for the prediction and
discard the message body. As previous work [e.g. 26] has shown, this is not a
major disadvantage and can even help in reducing noise, as long as the titles
concisely represent the content of the text.

We gather the financial data of the releasing companies from Thomson
Reuters Datastream. We retrieve the firm performance with the help of the
International Securities Identification Numbers (ISIN) that appear first in
each of the ad hoc announcements. The stock price data before and on the
day of the announcement are extracted using the corresponding trading day.
These are then used to calculate abnormal returns [371 EHl EU; abnormal
returns can be regarded as some kind of excess return caused by the news
release. In addition, we remove penny stocks with stock prices below $5 for
noise reduction. We then label each announcement title with one of three
return direction classes (up, down or steady), according to the abnormal
return of the corresponding stock on the announcement day, and discard the
steady samples for noise reduction.
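
The labeling step can be sketched as follows. This is a simplified illustration and not our exact procedure: the abnormal return is taken here as the difference between the observed return and some expected return supplied by the caller, and the threshold that separates steady from up and down is a hypothetical value, since neither detail is specified above.

```python
def simple_return(price_before, price_on_day):
    """Relative price change from the previous trading day to the announcement day."""
    return (price_on_day - price_before) / price_before


def label_announcement(stock_return, expected_return, threshold=0.01):
    """Map an abnormal return to 'up', 'down' or 'steady'.
    The threshold of 0.01 is an assumption made for illustration."""
    abnormal_return = stock_return - expected_return
    if abnormal_return > threshold:
        return "up"
    if abnormal_return < -threshold:
        return "down"
    return "steady"


# Hypothetical prices and expected returns.
labels = [
    label_announcement(simple_return(10.0, 10.5), expected_return=0.01),
    label_announcement(simple_return(20.0, 19.4), expected_return=0.0),
    label_announcement(simple_return(8.0, 8.04), expected_return=0.0),
]
# Steady samples are discarded for noise reduction.
kept = [label for label in labels if label != "steady"]
print(labels, kept)
```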

The resulting dataset consists of 8359 headlines from ad hoc
announcements with corresponding class labels up or down. Of this complete dataset,
we use the samples covering the first 80 % of the timeline as training samples
and the remaining 20 % as test samples.
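
A chronological split of this kind can be sketched as follows. The sketch assumes that the split is made over the samples in date order (the first 80 % of the ordered samples), which is one possible reading of the description above; the sample tuples are hypothetical.

```python
from datetime import date


def chronological_split(samples, train_fraction=0.8):
    """Sort (date, headline, label) samples by date and split them without shuffling,
    so that every training sample precedes every test sample in time."""
    ordered = sorted(samples, key=lambda sample: sample[0])
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]


# Ten hypothetical samples in arbitrary order.
samples = [
    (date(2004 + i // 2, 1 + (i % 2) * 5, 15), f"headline {i}", "up" if i % 2 == 0 else "down")
    for i in (3, 0, 7, 1, 9, 4, 6, 2, 8, 5)
]
train, test = chronological_split(samples)
print(len(train), len(test))
print(train[-1][0], test[0][0])
```

Splitting along the timeline, rather than at random, ensures that the classifier is never evaluated on announcements that predate its training data.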

4.2. Preliminary Results

We can now apply the above methods for predictive analytics to provide
decision support regarding how investors react to textual news disclosures.
By comparing random forests and recursive autoencoders, we can evaluate
the following hypothesis.

^2 Kindly provided by Deutsche Gesellschaft für Ad-hoc-Publizität (DGAP).

The detailed results are listed in Table 1. We compare the predictive
performance on the out-of-sample test set in terms of accuracy, precision, recall
and the F1-score. The random forest as our benchmark achieves an in-sample
accuracy of 0.63 and an out-of-sample accuracy of 0.53. In comparison, the
recursive autoencoder^3 as our deep learning architecture results in an
accuracy of 0.56. This accounts for a relative improvement of 5.66 %. Similarly,
the F1-score increases from 0.52 to 0.56, a relative rise of 7.69 %. The
higher accuracy, as well as the improved F1-score, of the recursive
autoencoder underlines our initial assumption that deep learning algorithms can
outperform classical machine learning algorithms. Recursive autoencoders
have an additional advantage: one can simply inject the complete set of
news headlines as input without the manual effort of feature engineering.
The reason for this is that the calculation and optimization of a feature
representation is integrated into the optimization routines of deep learning
algorithms.

The above results are in line with the accuracies of around 60 % that related
work reports when using the full message body [28, 39]. In direct comparison to
the random forest benchmark, our evaluation provides evidence that deep
learning can outperform classical machine learning in the prediction of stock
price movements.
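
The relative improvements reported in Table 1 follow from the rounded scores of both classifiers as (new - baseline) / baseline. The short check below recomputes them; the function name is chosen for illustration.

```python
def relative_improvement(baseline, improved):
    """Relative change of `improved` over `baseline` in percent."""
    return (improved - baseline) / baseline * 100.0


# (random forest, recursive autoencoder) scores as listed in Table 1.
results = {
    "accuracy": (0.53, 0.56),
    "precision": (0.53, 0.56),
    "recall": (0.51, 0.56),
    "f1": (0.52, 0.56),
}
improvements = {name: relative_improvement(rf, rae) for name, (rf, rae) in results.items()}
for name, value in improvements.items():
    print(f"{name}: {value:.2f} %")
```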


Predictive Analytics Method    Accuracy    Precision    Recall    F1-Score
Random Forest                  0.53        0.53         0.51      0.52
Recursive Autoencoder          0.56        0.56         0.56      0.56
Relative Improvement           5.66 %      5.66 %       9.80 %    7.69 %

Table 1: Preliminary Results Evaluating Improvements by Utilizing Deep
Learning to Predict the Direction of Stock Price Movements Following
Financial Disclosures


^3 We systematically tried several combinations for the two adjustable parameters
embedding size (the length I of the mapped feature vectors) and number-of-iterations (i.e.
number of gradient descent iterations). The best result accounts for an accuracy of 0.56 on
the test set, with a vector embedding size of 40 and 70 iterations. As expected, increasing
the number of iterations usually results in better accuracy on the training set and lower
accuracy on the test set, a typical indication of over-fitting.

4.3. Discussion and Implications for Practitioners

Traditional machine learning techniques still represent the default method
of choice in predictive analytics. However, recent research indicates that these
methods insufficiently capture the properties of complex, non-linear
problems. Accordingly, the experiments in this paper show that a deep learning
algorithm is capable of implicitly generating a favorable knowledge
representation.

As a recommendation to practitioners, better results are achievable with
deep learning than with classical methods that rely on explicitly generated
features. Nevertheless, practitioners must be aware of the complex
architecture of deep learning models. This requires both a thorough understanding
and solid experience in order to use such models efficiently.

5. Conclusion and Research Outlook 

In the present paper, we show how a novel deep learning algorithm can
be applied to provide decision support for the financial domain. Thereby, we
contribute to the theory of Information Systems research by shedding light
on how to exploit deep learning as a recent trend from Big Data analytics
for managerial decision support. We demonstrate that a recursive
autoencoder outperforms a classical machine learning method in the prediction of
stock market movements following financial news disclosures. The
recursive autoencoder benefits from being able to automatically generate a deep
knowledge representation.

In future research, we intend to broaden the preliminary results of this
Research-in-Progress paper. First, our analysis could benefit substantially
from incorporating and comparing further algorithms from predictive
analytics. Second, we want to generalize our results by including further news
sources, such as 8-K filings. For further evaluations of recursive autoencoders,
we plan to apply the model to a wider range of tasks, such as exploiting the
algorithm’s capacity to predict complete class distributions.

References 

[1] E. Turban, Business Intelligence: A Managerial Approach, 2 ed.,
Prentice Hall, Boston, MA, 2011.

[2] C. Apte, B. Liu, E. P. D. Pednault, P. Smyth, Business Applications of 
Data Mining, Communications of the ACM 45 (2002) 49-53. 





[3] D. Arnott, G. Pervan, A Critical Analysis of Decision Support Systems 
Research, Journal of Information Technology 20 (2005) 67-87. 

[4] I. Asadi Someh, G. Shanks, How Business Analytics Systems Provide
Benefits and Contribute to Firm Performance?, in: 23rd European
Conference on Information Systems (ECIS 2015), 2015.

[5] J. E. Boylan, A. A. Syntetos, Forecasting in Management Science,
Omega 40 (2012) 681.

[6] T. H. Davenport, Competing on Analytics, Harvard Business Review
134 (2006) 98-107.

[7] K. Vizecky, Data Mining meets Decision Making: A Case Study
Perspective, in: Americas Conference on Information Systems (AMCIS
2011), 2011, p. Paper 453.

[8] D. Boyd, K. Crawford, Critical Questions for Big Data, Information,
Communication & Society 15 (2012) 662-679.

[9] H. Chen, R. H. L. Chiang, V. C. Storey, Business Intelligence and 
Analytics: From Big Data to Big Impact, MIS Quarterly 36 (2012) 
1165-1188. 

[10] F. Halper, The Top 5 Trends in Predictive Analytics, 2011. URL:
http://www.information-management.com/issues/21_6/the-top-5-trends-in-redictive-an-alytics-10021460-1.html

[11] D. J. Power, Using ‘Big Data’ for Analytics and Decision Support,
Journal of Decision Systems 23 (2014) 222-228.

[12] IBM, The Four V’s of Big Data, 2013. URL:
http://www.ibmbigdatahub.com/infographic/four-vs-big-data.

[13] G. Shmueli, O. Koppius, Predictive Analytics in Information Systems 
Research, MIS Quarterly 35 (2011) 553-572. 

[14] A. K. Nassirtoussi, S. Aghabozorgi, T. Y. Wah, D. C. L. Ngo, Text 
Mining for Market Prediction: A Systematic Review, Expert Systems 
with Applications 41 (2014) 7653-7670. 





[15] M. Minev, C. Schommer, T. Grammatikos, News and Stock Markets: A 
Survey on Abnormal Returns and Prediction Models (2012). 

[16] B. Pang, L. Lee, Opinion Mining and Sentiment Analysis, Foundations
and Trends in Information Retrieval (2008) 1-135.

[17] I. Arel, D. C. Rose, T. P. Karnowski, Deep Machine Learning: A New
Frontier in Artificial Intelligence Research, IEEE Computational
Intelligence Magazine 5 (2010) 13-18.

[18] G. E. Hinton, R. R. Salakhutdinov, Reducing the Dimensionality of 
Data with Neural Networks, Science 313 (2006) 504-507. 

[19] R. Socher, J. Pennington, E. Huang, A. Ng, C. Manning, Semisupervised 
Recursive Autoencoder, Proceedings of the Conference on Empirical 
Methods in Natural Language Processing (2011) 151-161. 

[20] M.-A. Mittermayer, G. F. Knolmayer, Text Mining Systems for Market 
Response to News: A Survey: Working Paper, SSRN Electronic Journal 
(2006). 

[21] N. Pröllochs, S. Feuerriegel, D. Neumann, Generating Domain-Specific
Dictionaries Using Bayesian Learning, in: 23rd European Conference on
Information Systems (ECIS 2015), 2015. doi:10.2139/ssrn.2522884.

[22] S. J. Alfano, S. Feuerriegel, D. Neumann, Is News Sentiment More than
Just Noise?, in: 23rd European Conference on Information Systems
(ECIS 2015), 2015. doi:10.2139/ssrn.2520445.

[23] S. Feuerriegel, D. Neumann, News or Noise? How News Drives
Commodity Prices, in: Proceedings of the International Conference on
Information Systems (ICIS 2013), Association for Information Systems,
2013.

[24] J. Bollen, H. Mao, X. Zeng, Twitter Mood Predicts the Stock Market,
Journal of Computational Science 2 (2011) 1-8.

[25] J. Muntermann, A. Guettler, Intraday Stock Price Effects of Ad Hoc
Disclosures: The German Case, Journal of International Financial
Markets, Institutions and Money 17 (2007) 1-24.




[26] D. Peramunetilleke, R. Wong, Currency Exchange Rate Forecasting 
from News Headlines, Australian Computer Science Communications 
24 (2002) 131-139. 

[27] S. S. Groth, J. Muntermann, An Intraday Market Risk Management 
Approach Based on Textual Analysis, Decision Support Systems 50 
(2011) 680-691. 

[28] M. Hagenau, M. Liebmann, D. Neumann, Automated News Reading:
Stock Price Prediction based on Financial News Using
Context-Capturing Features, Decision Support Systems 55 (2013) 685-697.

[29] Y. Bengio, Learning Deep Architectures for AI, 2009. URL:
http://www.iro.umontreal.ca/~lisa/pointeurs/TR1312.pdf.

[30] L. Takeuchi, L. Yu Ying, Applying Deep Learning to Enhance
Momentum Learning Trading Strategies in Stocks, ???? URL:
http://cs229.stanford.edu/proj2013/TakeuchiLee-ApplyingDeepLearningToEnhanceMomentumTradingStrategiesInStocks.pdf.

[31] R. Socher, A. Perelygin, J. Y. Wu, J. Chuang, C. D. Manning, A. Y. Ng,
C. Potts, Recursive Deep Models for Semantic Compositionality Over
a Sentiment Treebank, in: Proceedings of the Conference on Empirical
Methods in Natural Language Processing (EMNLP), volume 1631, 2013.

[32] C. D. Manning, H. Schütze, Foundations of Statistical Natural Language
Processing, 6 ed., MIT Press, Cambridge, MA, 1999.

[33] G. Salton, E. A. Fox, H. Wu, Extended Boolean Information Retrieval,
Communications of the ACM 26 (1983) 1022-1036.

[34] L. Breiman, Random Forests, Machine Learning 45 (2001) 5-32. 

[35] T. Hastie, R. Tibshirani, J. H. Friedman, The Elements of Statistical
Learning: Data Mining, Inference, and Prediction, Springer Series in
Statistics, 2nd ed., Springer, New York, 2009.

[36] M. Kuhn, K. Johnson, Applied Predictive Modeling, Springer, New 
York, NY, 2013. 





[37] A. C. MacKinlay, Event Studies in Economics and Finance, Journal of 
Economic Literature 35 (1997) 13-39. 

[38] Y. Konchitchki, D. E. O’Leary, Event Study Methodologies in
Information Systems Research, International Journal of Accounting Information
Systems 12 (2011) 99-115.

[39] S. Groth, J. Muntermann, A Text Mining Approach to Support Intraday
Financial Decision-Making, in: Americas Conference on Information
Systems (AMCIS 2008), ????





